Choosing the Right Translation: A Syntactically Informed Classification Approach

نویسندگان

  • Simon Zwarts
  • Mark Dras
چکیده

One style of Multi-Engine Machine Translation architecture involves choosing the best of a set of outputs from different systems. Choosing the best translation from an arbitrary set, even in the presence of human references, is a difficult problem; it may prove better to look at mechanisms for making such choices in more restricted contexts. In this paper we take a classificationbased approach to choosing between candidates from syntactically informed translations. The idea is that using multiple parsers as part of a classifier could help detect syntactic problems in this context that lead to bad translations; these problems could be detected on either the source side—perhaps sentences with difficult or incorrect parses could lead to bad translations—or on the target side—perhaps the output quality could be measured in a more syntactically informed way, looking for syntactic abnormalities. We show that there is no evidence that the source side information is useful. However, a target-side classifier, when used to identify particularly bad translation candidates, can lead to significant improvements in Bleu score. Improvements are even greater when combined with existing language and alignment model approaches. c © 2008. Licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license (http://creativecommons.org/licenses/by-ncsa/3.0/). Some rights reserved.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Is getting the right answer just about choosing the right words? The role of syntactically-informed features in short answer scoring

Developments in the educational landscape have spurred greater interest in the problem of automatically scoring short answer questions. A recent shared task on this topic revealed a fundamental divide in the modeling approaches that have been applied to this problem, with the best-performing systems split between those that employ a knowledge engineering approach and those that almost solely le...

متن کامل

Dual-Path Phrase-Based Statistical Machine Translation

Preceding a phrase-based statistical machine translation (PSMT) system by a syntactically-informed reordering preprocessing step has been shown to improve overall translation performance compared to a baseline PSMT system. However, the improvement is not seen for every sentence. We use a lattice input to a PSMT system in order to translate simultaneously across both original and reordered versi...

متن کامل

The impact of parse quality on syntactically-informed statistical machine translation

We investigate the impact of parse quality on a syntactically-informed statistical machine translation system applied to technical text. We vary parse quality by varying the amount of data used to train the parser. As the amount of data increases, parse quality improves, leading to improvements in machine translation output and results that significantly outperform a state-of-the-art phrasal ba...

متن کامل

Semantic Prosody: Its Knowledge and Appropriate Selection of Equivalents

In translation, choosing appropriate equivalent is essential to convey the right message from source-text to target-text, and one of the issues that may have a determinative role in appropriate equivalent choice is the semantic prosody (SP) behavior of words and the relation existing between the SP of a word and semantic senses (i.e. negativity, positivity or neutrality) of its collocations in ...

متن کامل

Semantic Prosody: Its Knowledge and Appropriate Selection of Equivalents

In translation, choosing appropriate equivalent is essential to convey the right message from source-text to target-text, and one of the issues that may have a determinative role in appropriate equivalent choice is the semantic prosody (SP) behavior of words and the relation existing between the SP of a word and semantic senses (i.e. negativity, positivity or neutrality) of its collocations in ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008